Tables and Figures Detection Using Layout Parser
================================================
.. note::

   View the complete implementation in Google Colab: `Tables and Figures Detection Notebook <https://colab.research.google.com/github/MasrourTawfik/Textra_research_v1/blob/main/docs/notebooks/tables_figures_detections.ipynb>`_

Introduction
-------------
LayoutParser is a toolkit for document layout analysis that detects and extracts elements such as tables, figures, and text blocks from document images. It relies on deep learning models trained on large datasets such as PubLayNet to identify the different components of a page.

Installation
-------------
Install LayoutParser and its dependencies:

.. code-block:: bash

   pip install layoutparser
   # If you encounter any problem with this step, refer to readme.md
   pip install "detectron2@git+https://github.com/facebookresearch/detectron2.git@v0.5#egg=detectron2"
   pip install "layoutparser[layoutmodels]"
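
To confirm that the packages can be found after installation, a small check such as the following may help. It is a minimal sketch that uses only the standard library; the helper name ``check_installation`` is illustrative and not part of LayoutParser.

.. code-block:: python

    from importlib.util import find_spec


    def check_installation(packages=("layoutparser", "detectron2")):
        """Return a dict mapping each package name to True if it can be located."""
        # find_spec looks a top-level package up without importing it.
        return {name: find_spec(name) is not None for name in packages}


    if __name__ == "__main__":
        for name, found in check_installation().items():
            print(f"{name}: {'installed' if found else 'missing'}")

A ``missing`` entry suggests that the corresponding ``pip install`` step did not complete in the active environment.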

Components
------------

Model Initialization
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

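    import layoutparser as lp

    # Detection thresholds (same values as the process_single_page defaults)
    table_threshold, figure_threshold = 0.3, 0.8

    model = lp.Detectron2LayoutModel(
        'lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
        extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", min(table_threshold, figure_threshold)],
        label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}
    )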

For customization:

- Modify ``label_map`` so that it matches the classes of the model you load
- Adjust the thresholds to change detection sensitivity
- Use a different pre-trained model (e.g., ``lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config``), updating ``label_map`` to that model's label set
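
The model-level score threshold is set to the smaller of the two thresholds, presumably so that no table or figure candidate is discarded before any stricter per-type filtering takes place. The sketch below illustrates that choice. ``build_model_kwargs`` is a hypothetical helper, not part of LayoutParser; it only assembles arguments and does not load a model, and it assumes the constructor's first parameter is named ``config_path``.

.. code-block:: python

    PUBLAYNET_CONFIG = "lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config"
    PUBLAYNET_LABELS = {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}


    def build_model_kwargs(table_threshold=0.3, figure_threshold=0.8,
                           config_path=PUBLAYNET_CONFIG, label_map=None):
        """Assemble keyword arguments for lp.Detectron2LayoutModel (hypothetical helper)."""
        for name, value in (("table_threshold", table_threshold),
                            ("figure_threshold", figure_threshold)):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")
        # Use the lower threshold at the model level; stricter per-type
        # filtering is assumed to happen after detection.
        score_threshold = min(table_threshold, figure_threshold)
        return {
            "config_path": config_path,
            "extra_config": ["MODEL.ROI_HEADS.SCORE_THRESH_TEST", score_threshold],
            "label_map": dict(label_map or PUBLAYNET_LABELS),
        }

With LayoutParser installed, the result could be passed on as ``lp.Detectron2LayoutModel(**build_model_kwargs())``.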

Block Type Detection
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    def get_block_type(block):
        """Helper function to safely get block type from layout detection"""
        if hasattr(block, 'type'):
            return block.type
        
        if hasattr(block, 'label'):
            if isinstance(block.label, str):
                return block.label
            if isinstance(block.label, (int, float)):
                type_mapping = {
                    0: 'Text',
                    1: 'Title',
                    2: 'List',
                    3: 'Table',
                    4: 'Figure'
                }
                return type_mapping.get(int(block.label), 'Unknown')
        
        return 'Unknown'

For customization:

- Add new types to ``type_mapping``
- Modify return values for different classification needs
- Add custom type detection logic
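
One way to add new types without editing the helper itself is to pass the mapping in as a parameter. The following is a sketch under stated assumptions: ``get_block_type_with_mapping`` is a hypothetical variant, the ``Equation`` class with id 5 is invented for illustration (the PubLayNet model shown earlier has only five classes), and ``SimpleNamespace`` objects stand in for detected blocks. Unlike the helper above, this variant also falls through to ``label`` when ``type`` is ``None``.

.. code-block:: python

    from types import SimpleNamespace

    DEFAULT_TYPES = {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}


    def get_block_type_with_mapping(block, type_mapping=None):
        """Variant of get_block_type that accepts a caller-supplied mapping."""
        mapping = DEFAULT_TYPES if type_mapping is None else type_mapping
        block_type = getattr(block, "type", None)
        if block_type is not None:
            return block_type
        label = getattr(block, "label", None)
        if isinstance(label, str):
            return label
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(label, (int, float)) and not isinstance(label, bool):
            return mapping.get(int(label), "Unknown")
        return "Unknown"


    # Hypothetical extra class, used only to show how the mapping is extended.
    extended = {**DEFAULT_TYPES, 5: "Equation"}
    print(get_block_type_with_mapping(SimpleNamespace(label=5), extended))  # Equation
    print(get_block_type_with_mapping(SimpleNamespace(label=5)))            # Unknown
    print(get_block_type_with_mapping(SimpleNamespace(type="Table")))       # Table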

Visualization
~~~~~~~~~~~~~

.. code-block:: python

    from PIL import ImageDraw

    def create_visualization(image, detected_elements, show_plot=True):
        """Create visualization of detected tables and figures"""
        viz_image = image.copy()  # draw on a copy so the input stays unchanged
        draw = ImageDraw.Draw(viz_image)
        
        # Customize colors and labels for different element types
        element_styles = {
            'tables': {'color': 'red', 'label': 'Table'},
            'figures': {'color': 'green', 'label': 'Figure'}
        }
        # The drawing steps are not shown in this excerpt; see the linked notebook.
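
The function above stops after defining the styles. As a rough illustration of how the drawing step might look, the sketch below outlines each detected element and writes a caption above it. The structure of ``detected_elements`` is an assumption: a dict whose ``"tables"`` and ``"figures"`` keys hold lists of dicts with a ``"bbox"`` entry ``(x1, y1, x2, y2)`` and an optional ``"score"``. The name ``draw_detections`` and the ``line_width`` parameter are hypothetical, and Pillow is required.

.. code-block:: python

    from PIL import Image, ImageDraw

    ELEMENT_STYLES = {
        "tables": {"color": "red", "label": "Table"},
        "figures": {"color": "green", "label": "Figure"},
    }


    def draw_detections(image, detected_elements, line_width=3):
        """Return a copy of ``image`` with one labelled box per detected element.

        Assumes each element is a dict with a "bbox" of (x1, y1, x2, y2)
        and, optionally, a "score".
        """
        viz_image = image.copy()
        draw = ImageDraw.Draw(viz_image)
        for key, style in ELEMENT_STYLES.items():
            for index, element in enumerate(detected_elements.get(key, []), start=1):
                x1, y1, x2, y2 = element["bbox"]
                draw.rectangle([x1, y1, x2, y2], outline=style["color"], width=line_width)
                caption = f"{style['label']} {index}"
                if "score" in element:
                    caption += f" ({element['score']:.2f})"
                # Place the caption just above the box, clamped to the image.
                draw.text((x1, max(0, y1 - 12)), caption, fill=style["color"])
        return viz_image

For a quick check, the function can be called on a blank page created with ``Image.new("RGB", (200, 200), "white")`` and a single hand-written bounding box; the returned image can then be displayed or saved.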


Detection Processing
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    def process_single_page(image_path, table_threshold=0.3, figure_threshold=0.8):
        """Process a single page to detect tables and figures"""

Parameters to adjust:

- ``table_threshold``: Lower values detect more tables but may increase false positives
- ``figure_threshold``: Higher values ensure more confident figure detection
- Additional thresholds can be introduced to handle other element types
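
The body of ``process_single_page`` is not shown here. One plausible way to apply the two thresholds after detection is sketched below. It assumes each detected block exposes ``type`` and ``score`` attributes; ``split_by_threshold`` is a hypothetical helper, and ``SimpleNamespace`` objects stand in for real detections.

.. code-block:: python

    from types import SimpleNamespace


    def split_by_threshold(blocks, table_threshold=0.3, figure_threshold=0.8):
        """Group detected blocks into tables and figures using per-type thresholds."""
        thresholds = {
            "Table": ("tables", table_threshold),
            "Figure": ("figures", figure_threshold),
        }
        detected = {"tables": [], "figures": []}
        for block in blocks:
            entry = thresholds.get(getattr(block, "type", None))
            if entry is None:
                continue  # ignore text, titles, lists, and unknown types
            key, threshold = entry
            score = getattr(block, "score", None)
            if score is not None and score >= threshold:
                detected[key].append(block)
        return detected


    # Stand-in detections for illustration only.
    blocks = [
        SimpleNamespace(type="Table", score=0.45),
        SimpleNamespace(type="Figure", score=0.45),
        SimpleNamespace(type="Figure", score=0.93),
        SimpleNamespace(type="Text", score=0.99),
    ]
    result = split_by_threshold(blocks)
    print(len(result["tables"]), len(result["figures"]))  # prints: 1 1

The same score of 0.45 is accepted for a table but rejected for a figure, which is the effect of keeping separate thresholds per element type.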


Usage Examples
--------------

Basic usage with default thresholds:

.. code-block:: python

    result = process_single_page("path/to/document.png")

Adjust detection sensitivity:

.. code-block:: python

    # More lenient detection
    result_lenient = process_single_page(
        "path/to/document.png",
        table_threshold=0.1,
        figure_threshold=0.6
    )

    # Stricter detection
    result_strict = process_single_page(
        "path/to/document.png",
        table_threshold=0.5,
        figure_threshold=0.9
    )